7 research outputs found

    Data-Aware Scheduling Strategy for Scientific Workflow Applications in IaaS Cloud Computing

    Scientific workflows benefit from the cloud computing paradigm, which offers access to virtual resources provisioned on a pay-as-you-go and on-demand basis. Minimizing resource costs while meeting the user's budget is very important in a cloud environment. Several optimization approaches have been proposed to improve the performance and cost of data-intensive scientific Workflow Scheduling (DiSWS) in cloud computing. However, the majority of DiSWS approaches in the literature have focused on heuristics and metaheuristics as optimization methods, and the task hierarchy in data-intensive scientific workflows has not been extensively explored. In this paper, a data-intensive scientific workflow is represented as a hierarchy that specifies hierarchical relations between workflow tasks, and an approach for scheduling data-intensive workflow applications is proposed. In this approach, the datasets and workflow tasks are first modeled as a conditional probability matrix (CPM). Second, several data transformations and hierarchical clustering are applied to the CPM structure to determine the minimum number of virtual machines needed for the workflow execution, where the hierarchical clustering respects the budget imposed by the user. After data transformation and hierarchical clustering, the amount of data transmitted between clusters is reduced, which can improve the cost and makespan of the workflow by optimizing the use of virtual resources and network bandwidth. The performance and cost are analyzed using an extension of the CloudSim simulation toolkit and compared with existing multi-objective approaches. The results demonstrate that our approach reduces resource costs with respect to user budgets.
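    The CPM-plus-clustering idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy matrix values, the min-overlap affinity measure, and treating the user's budget as a simple cap on the number of VMs. It is not the paper's actual transformation or clustering procedure.

```python
from itertools import combinations

def data_affinity(row_a, row_b):
    # Shared-data affinity between two tasks: overlap of their
    # dataset-access probabilities (an illustrative measure).
    return sum(min(a, b) for a, b in zip(row_a, row_b))

def cluster_tasks(cpm, max_vms):
    """Greedy agglomerative clustering of workflow tasks.

    cpm[i][j] is the probability that task i accesses dataset j.
    Tasks that share data are merged so they can be placed on the
    same VM, reducing inter-VM transfers; max_vms stands in for
    the user's budget constraint.
    """
    clusters = [[i] for i in range(len(cpm))]
    while len(clusters) > max_vms:
        # Merge the pair of clusters with the highest average affinity.
        best, best_pair = -1.0, None
        for (x, cx), (y, cy) in combinations(enumerate(clusters), 2):
            aff = sum(data_affinity(cpm[i], cpm[j])
                      for i in cx for j in cy) / (len(cx) * len(cy))
            if aff > best:
                best, best_pair = aff, (x, y)
        x, y = best_pair
        clusters[x] += clusters[y]
        del clusters[y]
    return clusters

# Tasks 0 and 1 mostly read dataset A; tasks 2 and 3 mostly read B.
cpm = [[0.9, 0.1],
       [0.8, 0.2],
       [0.1, 0.9],
       [0.2, 0.7]]
print(cluster_tasks(cpm, max_vms=2))  # → [[0, 1], [2, 3]]
```

    Grouping data-sharing tasks onto the same VM is what reduces the inter-cluster traffic that the abstract credits for the cost and makespan improvements.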

    Distributed Load Balancing Model for Grid Computing

    Most existing load balancing strategies have targeted distributed systems assumed to have homogeneous resources interconnected by homogeneous, high-speed networks. For grid computing, these assumptions are not realistic because of its heterogeneity, scalability, and dynamicity. In such environments, load balancing is therefore a new challenge that many research projects are currently addressing. Our contribution in this paper is twofold. First, we propose a distributed load balancing model that can represent any grid topology as a forest structure. We then develop, on top of this model, a two-level load balancing strategy whose principal objectives are to reduce the average response time of tasks and their transfer cost. The proposed strategy is distributed by nature, with local decision making, which avoids recourse to the wide-area communication network.
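    The two-level strategy (balance locally first; migrate across the wide-area network only when local decisions are not enough) can be sketched as follows. The threshold, the averaging, and the redistribution rule are illustrative assumptions, not the paper's actual policy.

```python
def balance(loads, threshold=1.0):
    """Even out a list of worker loads.

    Repeatedly shifts work from the most- to the least-loaded
    worker until the spread falls below the threshold.
    """
    loads = list(loads)
    while max(loads) - min(loads) > threshold:
        hi, lo = loads.index(max(loads)), loads.index(min(loads))
        shift = (loads[hi] - loads[lo]) / 2
        loads[hi] -= shift
        loads[lo] += shift
    return loads

def two_level_balance(forest, threshold=1.0):
    """Two-level strategy sketch over a forest of clusters.

    Level 1: each cluster balances its own workers (a purely local
    decision, so no wide-area traffic). Level 2: only if the cluster
    averages remain uneven does the root migrate load between
    clusters over the wide-area network.
    """
    # Level 1: intra-cluster balancing.
    forest = [balance(cluster, threshold) for cluster in forest]
    # Level 2: inter-cluster balancing on cluster averages.
    avgs = [sum(c) / len(c) for c in forest]
    if max(avgs) - min(avgs) > threshold:
        avgs = balance(avgs, threshold)
        forest = [[a] * len(c) for a, c in zip(avgs, forest)]
    return forest

print(two_level_balance([[10, 2], [1, 1]]))  # → [[3.5, 3.5], [3.5, 3.5]]
```

    Resolving imbalance at level 1 whenever possible is what lets the strategy avoid the wide-area network, the key goal stated in the abstract.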

    CO-ALLOCATION IN GRID COMPUTING USING RESOURCES OFFERS AND ADVANCE RESERVATION PLANNING

    Computational grids have the potential to solve large-scale scientific problems using heterogeneous and geographically distributed resources. However, a number of major technical hurdles must be overcome before this potential can be realized. One problem that is critical to the effective utilization of computational grids, and to providing a certain Quality of Service (QoS) for grid users, is the efficient co-allocation of jobs. The advance reservation technique has been widely applied in many grid systems to provide QoS; however, it results in a low resource utilization rate and a high rejection rate when the reservation rate is high. This work addresses those problems by describing and evaluating a grid resource co-allocation algorithm that uses resource providers' offers and plans advance reservations. In our algorithm, a metascheduler performs job scheduling based on resource offers and uses an advance reservation planning mechanism to reserve the best offers. Offers act as a mechanism by which resource providers express their interest in executing an entire job or only part of it. The metascheduler selects computational resources based on the best offers provided by the resources; metaschedulers can distribute a job among various, usually heterogeneous, clusters in order to speed up job execution. The main aims of our algorithm are to minimize the total time to execute all jobs (makespan), minimize the waiting time in the global queue, maximize the resource utilization rate, and balance the load among the resources. The proposed algorithm has been compared with other scheduling schemes such as First Come First Served (FCFS), Easy Backfilling (EBF), Fit Processor First Served (FPFS), and a simple co-allocation algorithm without offer support (SCOAL). The proposed algorithm has been verified through an extension of the GridSim simulation toolkit, and the simulation results confirm that it achieves our goals by minimizing the makespan and the waiting time, maximizing the resource utilization rate, and balancing the load among the resources.
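    The offer-based co-allocation flow (providers expose offers for all or part of a job; the metascheduler accepts the best ones and plans reservations) can be sketched as below. The `Offer` fields, the earliest-start selection rule, and the reservation tuple format are all illustrative assumptions, not the paper's actual algorithm or API.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    provider: str    # resource provider making the offer
    cpus: int        # processors offered for this job (may cover only part of it)
    start: int       # earliest time slot the provider can reserve
    runtime: int     # estimated runtime on this provider

def co_allocate(offers, cpus_needed):
    """Offer-based co-allocation sketch.

    Greedily accepts the earliest-starting offers until the job's
    processor demand is met, then plans one advance reservation
    per accepted offer, possibly splitting the job across clusters.
    """
    plan, acquired = [], 0
    for off in sorted(offers, key=lambda o: o.start):
        if acquired >= cpus_needed:
            break
        take = min(off.cpus, cpus_needed - acquired)
        acquired += take
        # Planned advance reservation: (provider, cpus, start, end).
        plan.append((off.provider, take, off.start, off.start + off.runtime))
    # Reject the job if the combined offers cannot cover it.
    return plan if acquired >= cpus_needed else None

offers = [Offer("clusterA", 8, start=0, runtime=5),
          Offer("clusterB", 4, start=2, runtime=5)]
print(co_allocate(offers, cpus_needed=10))
```

    Planning reservations only over accepted offers, instead of reserving eagerly, is what lets this style of scheme fight the low utilization and high rejection rates the abstract attributes to plain advance reservation.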